IRIX Base Documentation 2002 November

Text File | 2002-10-03 | 43.6 KB | 793 lines

PMCD(1)                                                          PMCD(1)

NAME
     pmcd - performance metrics collector daemon

SYNOPSIS
     pmcd [-f] [-i ipaddress] [-l logfile] [-L bytes] [-n pmnsfile]
          [-q timeout] [-T traceflag] [-t timeout] [-x file]

DESCRIPTION
     pmcd is the collector used by the Performance Co-Pilot (see
     PCPIntro(1)) to gather performance metrics on a system. As a rule,
     there must be an instance of pmcd running on a system for any
     performance metrics to be available to the PCP.

     pmcd accepts connections from client applications running either
     on the same machine or remotely and provides them with metrics and
     other related information from the machine that pmcd is executing
     on. pmcd delegates most of this request servicing to a collection
     of Performance Metrics Domain Agents (or just agents), where each
     agent is responsible for a particular group of metrics, known as
     the domain of the agent. For example, the environ agent is
     responsible for reporting information relating to the environment
     of a Challenge system, such as the cabinet temperature and voltage
     levels of the power supply.

     The agents may be processes started by pmcd, independent processes
     or Dynamic Shared Objects (DSOs, see dso(5)) attached to pmcd's
     address space. The configuration section below describes how
     connections to agents are specified.

     The options to pmcd are as follows.

     -f
          By default pmcd is started as a daemon. The -f option
          indicates that it should run in the foreground. This is most
          useful when trying to diagnose problems with misbehaving
          agents.

     -i ipaddress
          This option is usually only used on hosts with more than one
          network interface. If no -i options are specified pmcd
          accepts connections made to any of its host's IP (Internet
          Protocol) addresses. The -i option is used to specify
          explicitly an IP address that connections should be accepted
          on. ipaddress should be in the standard dotted form (e.g.
          100.23.45.6). The -i option may be used multiple times to
          define a list of IP addresses. Connections made to any other
          IP addresses the host has will be refused. This can be used
          to limit connections to one network interface if the host is
          a network gateway. It is also useful if the host takes over
          the IP address of another host that has failed. In such a
          situation only the standard IP addresses of the host should
          be given (not the ones inherited from the failed host). This
          allows PCP applications to determine that a host has failed,
          rather than connecting to the host that has assumed the
          identity of the failed host.

     -l logfile
          By default a log file named pmcd.log is written in the
          directory $PCP_LOG_DIR/pmcd. The -l option causes the log
          file to be written to logfile instead of the default. If the
          log file cannot be created or is not writable, output is
          written to the standard error instead.

     -L bytes
          PDUs received by pmcd from monitoring clients are restricted
          to a maximum size of 65536 bytes by default to defend against
          Denial of Service attacks. The -L option may be used to
          change the maximum incoming PDU size.

     -n pmnsfile
          Normally pmcd loads the default Performance Metrics Name
          Space (PMNS) from $PCP_VAR_DIR/pmns/root, however if the -n
          option is specified an alternative namespace is loaded from
          the file pmnsfile.

     -q timeout
          The pmcd to agent version exchange protocol (new in PCP 2.0 -
          introduced to provide backward compatibility) uses this
          timeout to specify how long pmcd should wait before assuming
          that no version response is coming from an agent. If this
          timeout is reached, the agent is assumed to be an agent which
          does not understand the PCP 2.0 protocol. The default timeout
          interval is five seconds, but the -q option allows an
          alternative timeout interval (which must be greater than
          zero) to be specified. The unit of time is seconds.

     -t timeout
          To prevent misbehaving agents from hanging the entire
          Performance Metrics Collection System (PMCS), pmcd uses
          timeouts on PDU exchanges with agents running as processes.
          By default the timeout interval is five seconds. The -t
          option allows an alternative timeout interval in seconds to
          be specified. If timeout is zero, timeouts are turned off. It
          is almost impossible to use the debugger interactively on an
          agent unless timeouts have been turned off for its "parent"
          pmcd.

          Once pmcd is running, the timeout may be dynamically modified
          by storing an integer value (the timeout in seconds) into the
          metric pmcd.control.timeout via pmstore(1).

     -T traceflag
          To assist with error diagnosis for agents and/or clients of
          pmcd that are not behaving correctly, an internal event
          tracing mechanism is supported within pmcd.
          The value of traceflag is interpreted as a bit field with the
          following control functions:

          1     enable client connection tracing
          2     enable PDU tracing
          256   unbuffered event tracing

          By default, event tracing is buffered using a circular buffer
          that is overwritten as new events are recorded. The default
          buffer size holds the last 20 events, although this number
          may be overridden by using pmstore(1) to modify the metric
          pmcd.control.tracebufs.

          Similarly, once pmcd is running, the event tracing control
          may be dynamically modified by storing 1 (enable) or 0
          (disable) into the metrics pmcd.control.traceconn,
          pmcd.control.tracepdu and pmcd.control.tracenobuf. These
          metrics map to the bit fields associated with the traceflag
          argument for the -T option.

          When operating in buffered mode, the event trace buffer will
          be dumped whenever an agent connection is terminated by pmcd,
          or when any value is stored into the metric
          pmcd.control.dumptrace via pmstore(1).

          In unbuffered mode, every event will be reported when it
          occurs.

     -x file
          Before the pmcd logfile can be opened, pmcd may encounter a
          fatal error which prevents it from starting. By default, the
          output describing this error is sent to /dev/tty but it may
          be redirected to file.

     If a PDU exchange with an agent times out, the agent has violated
     the requirement that it delivers metrics with little or no delay.
     This is deemed a protocol failure and the agent is disconnected
     from pmcd. Any subsequent requests for information from the agent
     will fail with a status indicating that there is no agent to
     provide it.

     It is possible to specify host-level access control to pmcd. This
     allows one to prevent users from certain hosts from accessing the
     metrics provided by pmcd and is described in more detail in the
     ACCESS CONTROL CONFIGURATION section below.

CONFIGURATION
     On startup pmcd looks for a configuration file named
     $PCP_PMCDCONF_PATH. This file specifies which agents cover which
     performance metrics domains and how pmcd should make contact with
     the agents. An optional section specifying host-based access
     controls may follow the agent configuration data.

     Warning: pmcd is usually started as part of the boot sequence and
     runs as root. The configuration file may contain shell commands to
     create agents, which will be executed by root. To prevent security
     breaches the configuration file should be writable only by root.
     The use of absolute path names is also recommended.

     The case of the reserved words in the configuration file is
     unimportant, but elsewhere, the case is preserved.

     Blank lines and comments are permitted (even encouraged) in the
     configuration file. A comment begins with a ``#'' character and
     finishes at the end of the line. A line may be continued by
     ensuring that the last character on the line is a ``\''
     (backslash). A comment on a continued line ends at the end of the
     continued line. Spaces may be included in lexical elements by
     enclosing the entire element in double quotes (there must be
     whitespace before the opening and after the closing quote). A
     double quote preceded by a backslash is always a literal double
     quote.
     A ``#'' in double quotes or preceded by a backslash is treated
     literally rather than as a comment delimiter. Lexical elements and
     separators are described further in the following sections.

AGENT CONFIGURATION
     Each line of the agent configuration section of the configuration
     file contains details of how to connect pmcd to one of its agents
     and specifies which metrics domain the agent deals with. An agent
     may be attached as a DSO, or via a socket, or a pair of pipes.

     Each line of the agent configuration section of the configuration
     file must be either an agent specification, a comment, or a blank
     line. Lexical elements are separated by whitespace characters,
     however a single agent specification may not be broken across
     lines unless a \ (backslash) is used to continue the line.

     Each agent specification must start with a textual label (string)
     followed by an integer in the range 1 to 254. The label is a tag
     used to refer to the agent and the integer specifies the domain
     for which the agent supplies data. This domain identifier
     corresponds to the domain portion of the PMIDs handled by the
     agent. Each agent must have a unique label and domain identifier.

     For DSO agents a line of the form:

          label domain-no dso entry-point path

     should appear. Where,

     label
          is a string identifying the agent

     domain-no
          is an unsigned integer specifying the agent's domain in the
          range 1 to 254

     entry-point
          is the name of an initialization function which will be
          called when the DSO is loaded

     path
          designates the location of the DSO. This field is treated
          differently on Irix and on Linux. The latter expects it to be
          an absolute pathname, while the former uses some heuristics
          to find an agent. If path begins with a / it is taken as an
          absolute path specifying the DSO.
          If path is relative, pmcd will expect to find the agent in a
          file with the name mips_simabi.path, where simabi is either
          o32, n32 or 64.

          pmcd is only able to load DSO agents that have the same
          simabi (Subprogram Interface Model ABI, or calling
          conventions) as it does (i.e. only one of the simabi versions
          will be applicable). The simabi version of a running pmcd may
          be determined by fetching pmcd.simabi. Alternatively, the
          file(1) command may be used to determine the simabi version
          from the pmcd executable.

          For a relative path the environment variable PMCD_PATH
          defines a colon (:) separated list of directories to search
          when trying to locate the agent DSO. The default search path
          is /var/pcp/lib:/usr/pcp/lib.

     For agents providing socket connections, a line of the form

          label domain-no socket addr-family address [ command ]

     should appear. Where,

     label
          is a string identifying the agent

     domain-no
          is an unsigned integer specifying the agent's domain in the
          range 1 to 254

     addr-family
          designates whether the socket is in the AF_INET or AF_UNIX
          domain, and the corresponding values for this parameter are
          inet and unix respectively.

     address
          specifies the address of the socket within the previously
          specified addr-family. For unix sockets, the address should
          be the name of an agent's socket on the local host (a valid
          address for the UNIX domain).
          For inet sockets, the address may be either a port number or
          a port name which may be used to connect to an agent on the
          local host. There is no syntax for specifying an agent on a
          remote host as pmcd deals only with agents on the same
          machine.

     command
          is an optional parameter used to specify a command line to
          start the agent when pmcd initializes. If command is not
          present, pmcd assumes that the specified agent has already
          been created. The command is considered to start from the
          first non-white character after the socket address and finish
          at the next newline that isn't preceded by a backslash. After
          a fork(2) the command is passed unmodified to execve(2) to
          instantiate the agent.

     For agents interacting with pmcd via stdin/stdout, a line of the
     form:

          label domain-no pipe protocol command

     should appear. Where,

     label
          is a string identifying the agent

     domain-no
          is an unsigned integer specifying the agent's domain

     protocol
          specifies whether a text-based (ASCII) or a binary protocol
          should be used over the pipes. The two valid values for this
          parameter are text and binary.

          Note: To the best of our knowledge, nothing but the
          demonstration PMDA news agent and the America's Cup San Diego
          water temperature agent has ever used the ASCII PDU interface
          to pmcd. The current PCP libraries (in particular libpcp_pmda
          and libpcp_trace) make building a real PMDA less effort than
          fighting with the ASCII PDUs in a sh(1) script. Consequently,
          support for ASCII PDUs and hence the keyword text in the pmcd
          configuration file is discouraged.

     command
          specifies a command line to start the agent when pmcd
          initializes. Note that command is mandatory for pipe-based
          agents. The command is considered to start from the first
          non-white character after the protocol parameter and finish
          at the next newline that isn't preceded by a backslash. After
          a fork(2) the command is passed unmodified to execve(2) to
          instantiate the agent.

ACCESS CONTROL CONFIGURATION
     The access control section of the configuration file is optional,
     but if present it must follow the agent configuration data. The
     case of reserved words is ignored, but elsewhere case is
     preserved.

     Lexical elements in the access control section are separated by
     whitespace or the special delimiter characters: square brackets
     (``['' and ``]''), braces (``{'' and ``}''), colon (``:''),
     semicolon (``;'') and comma (``,''). The special characters are
     not treated as special in the agent configuration section.

     The access control section of the file must start with a line of
     the form:

          [access]

     Leading and trailing whitespace may appear around and within the
     brackets and the case of the access keyword is ignored. No other
     text may appear on the line except a trailing comment.

     Following this line, the remainder of the configuration file
     should contain lines that allow or disallow operations from
     particular hosts or groups of hosts.

     There are two kinds of operations that occur via pmcd:

     fetch
          allows retrieval of information from pmcd. This may be
          information about a metric (e.g. its description, instance
          domain or help text) or a value for a metric.

     store
          allows pmcd to be used to store metric values in agents that
          permit store operations.

     Access to pmcd is granted at the host level, i.e. all users on a
     host are granted the same level of access. Permission to perform
     the store operation should not be given indiscriminately; it has
     the potential to be abused by malicious users.

     Hosts may be identified by name, IP address or a wildcarded IP
     address with the single wildcard character ``*'' as the last-given
     component of the IP address. Host names may not be wildcarded. The
     following are all valid host identifiers:

          boing
          localhost
          giggle.melbourne.sgi.com
          129.127.112.2
          129.127.114.*
          129.*
          *

     The following are not valid host identifiers:

          *.melbourne
          129.127.*.*
          129.*.114.9
          129.127*

     The first example is not allowed because only (numeric) IP
     addresses may contain a wildcard. The second example is not valid
     because there is more than one wildcard character. The third
     contains an embedded wildcard, the fourth has a wildcard character
     that is not the last component of the IP address (the last
     component is 127*).

     The name localhost is given special treatment to make the behavior
     of host wildcarding consistent. Rather than being 127.0.0.1, it is
     mapped to the primary IP address associated with the name of the
     host on which pmcd is running. Beware of this when running pmcd on
     multi-homed hosts.

     Access for hosts is allowed or disallowed by specifying statements
     of the form:

          allow hostlist : operations ;
          disallow hostlist : operations ;

     hostlist
          is a comma separated list of host identifiers.

     operations
          is a comma separated list of the operation types described
          above, all (which allows/disallows all operations), or all
          except operations (which allows/disallows all operations
          except those listed).

     Where no specific allow or disallow statement applies to an
     operation for some host, the default is to allow the operation
     from that host. In the trivial case when there is no access
     control section in the configuration file, all operations from all
     hosts are permitted.

     If a new connection to pmcd is attempted from a host that is not
     permitted to perform any operations, the connection will be closed
     immediately after an error response PM_ERR_PERMISSION has been
     sent to the client attempting the connection.

     Statements with the same level of wildcarding specifying identical
     hosts may not contradict each other. For example if a host named
     clank had an IP address of 129.127.112.2, specifying the following
     two rules would be erroneous:

          allow clank : fetch, store;
          disallow 129.127.112.2 : all except fetch;

     because they both refer to the same host, but disagree as to
     whether the fetch operation is permitted from that host.

     Statements containing more specific host specifications override
     less specific ones according to the level of wildcarding. For
     example a rule of the form

          allow clank : all;

     overrides

          disallow 129.127.112.* : all except fetch;

     because the former contains a specific host name (equivalent to a
     fully specified IP address), whereas the latter has a wildcard. In
     turn, the latter would override

          disallow * : all;

     It is possible to limit the number of connections from a host to
     pmcd.
     This may be done by adding a clause of the form

          maximum n connections

     to the operations list of an allow statement. Such a clause may
     not be used in a disallow statement. Here, n is the maximum number
     of connections that will be accepted from hosts matching the host
     identifier(s) used in the statement.

     An access control statement with a list of host identifiers is
     equivalent to a group of access control statements, with each
     specifying one of the host identifiers in the list and all with
     the same access controls (both permissions and connection limits).
     A wildcard should be used if you want hosts to contribute to a
     shared connection limit.

     When a new client requests a connection, and pmcd has determined
     that the client has permission to connect, it searches the
     matching list of access control statements for the most specific
     match containing a connection limit. For brevity, this will be
     called the limiting statement. If there is no limiting statement,
     the client is granted a connection. If there is a limiting
     statement and the number of pmcd clients with IP addresses that
     match the host identifier in the limiting statement is less than
     the connection limit in the statement, the connection is allowed.
     Otherwise the connection limit has been reached and the client is
     refused a connection.

     The wildcarding in host identifiers means that once pmcd actually
     accepts a connection from a client, the connection may contribute
     to the current connection count of more than one access control
     statement (the client's host may match more than one access
     control statement). This may be significant for subsequent
     connection requests.
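
     The limit check described above can be sketched in a few lines of
     Python. This is an illustrative model only, not pmcd's
     implementation: the function names (matches, specificity,
     limiting_statement, may_connect) and the representation of each
     statement as a (pattern, limit) pair are assumptions made for the
     example, and host names are assumed to have been resolved to
     dotted IP addresses beforehand.

```python
# Hypothetical model of the connection-limit check; not pmcd source code.
# A statement is a (pattern, limit) pair; limit is None when the statement
# carries no "maximum n connections" clause.


def matches(pattern, ip):
    """True if a numeric host identifier matches a dotted IP address."""
    if pattern == "*":
        return True
    if pattern.endswith(".*"):
        # Keep the trailing dot so "12.*" does not match "129.1.2.3".
        return ip.startswith(pattern[:-1])
    return pattern == ip


def specificity(pattern):
    """Count the fixed components: 4 for a full address, 0 for "*"."""
    return len([part for part in pattern.split(".") if part != "*"])


def limiting_statement(statements, ip):
    """Most specific matching statement that has a connection limit."""
    candidates = [(pattern, limit) for pattern, limit in statements
                  if limit is not None and matches(pattern, ip)]
    if not candidates:
        return None
    return max(candidates, key=lambda stmt: specificity(stmt[0]))


def may_connect(statements, clients, ip):
    """Decide whether a new client at `ip` is within its limit.

    `clients` is the list of IP addresses of currently connected clients.
    Only the limiting statement is consulted, as described in the text.
    """
    limiting = limiting_statement(statements, ip)
    if limiting is None:
        return True
    pattern, limit = limiting
    current = sum(1 for client in clients if matches(pattern, client))
    return current < limit
```

     With the statements [("129.127.112.2", 5), ("*", 2)] and two
     clients already connected, one of them from 129.127.112.2,
     may_connect refuses a further client from any other address but
     still accepts another connection from 129.127.112.2, because only
     the most specific statement with a limit is checked.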

     Note that because most specific match semantics are used when
     checking the connection limit, priority is given to clients with
     more specific host identifiers. It is also possible to exceed
     connection limits in some situations. Consider the following:

          allow clank : all, maximum 5 connections;
          allow * : all except store, maximum 2 connections;

     This says that only 2 client connections at a time are permitted
     for all hosts other than "clank", which is permitted 5. If a
     client from host "boing" is the first to connect to pmcd, its
     connection is checked against the second statement (that is the
     most specific match with a connection limit). As there are no
     other clients, the connection is accepted and contributes towards
     the limit for only the second statement above. If the next client
     connects from "clank", its connection is checked against the limit
     for the first statement. There are no other connections from
     "clank", so the connection is accepted. Once this connection is
     accepted, it counts towards both statements' limits because
     "clank" matches the host identifier in both statements. Remember
     that the decision to accept a new connection is made using only
     the most specific matching access control statement with a
     connection limit. Now, the connection limit for the second
     statement has been reached. Any connections from hosts other than
     "clank" will be refused.

     If instead, pmcd with no clients saw three successive connections
     arrive from "boing", the first two would be accepted and the third
     refused. After that, if a connection was requested from "clank" it
     would be accepted. It matches the first statement, which is more
     specific than the second, so the connection limit in the first is
     used to determine that the client has the right to connect. Now
     there are 3 connections contributing to the second statement's
     connection limit.
     Even though the connection limit for the second statement has been
     exceeded, the earlier connections from "boing" are maintained. The
     connection limit is only checked at the time a client attempts a
     connection rather than being re-evaluated every time a new client
     connects to pmcd.

     This gentle scheme is designed to allow reasonable limits to be
     imposed on a first come first served basis, with specific
     exceptions.

     As illustrated by the example above, a client's connection is
     honored once it has been accepted. However, pmcd reconfiguration
     (see the next section) re-evaluates all the connection counts and
     will cause client connections to be dropped where connection
     limits have been exceeded.

RECONFIGURING PMCD
     If the configuration file has been changed or if an agent is not
     responding because it has terminated or the PMNS has been changed,
     pmcd may be reconfigured by sending it a SIGHUP, as in

          # killall -HUP pmcd

     When pmcd receives a SIGHUP, it checks the configuration file for
     changes. If the file has been modified, it is reparsed and the
     contents become the new configuration. If there are errors in the
     configuration file, the existing configuration is retained and the
     contents of the file are ignored. Errors are reported in the pmcd
     log file.

     It also checks the PMNS file for changes. If the PMNS file has
     been modified, then it is reloaded. Use of tail(1) on the log file
     is recommended while reconfiguring pmcd.

     If the configuration for an agent has changed (any parameter
     except the agent's label is different), the agent is restarted.
     Agents whose configurations do not change are not restarted. Any
     existing agents not present in the new configuration are
     terminated. Any deceased agents that are still listed are
     restarted.
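
     The restart rules above amount to a comparison of the old and new
     agent tables. The following Python sketch is a hypothetical model,
     not pmcd's code: it assumes that agents are keyed by domain
     number, that each table entry holds every configuration parameter
     other than the label, and that the caller already knows which
     agents are still alive. The name plan_reconfigure is invented for
     the example.

```python
# Hypothetical model of the agent restart rules applied on SIGHUP.


def plan_reconfigure(old, new, alive):
    """Return (start, terminate, keep) sets of domain numbers.

    old, new: dicts mapping domain number -> tuple of the agent's
              configuration parameters, excluding its label
              (an assumed layout for this sketch).
    alive:    set of domain numbers whose agent is still running.
    """
    start, terminate, keep = set(), set(), set()

    # Existing agents not present in the new configuration are terminated.
    for domain in old:
        if domain not in new:
            terminate.add(domain)

    for domain, params in new.items():
        changed = domain not in old or old[domain] != params
        deceased = domain not in alive
        if changed or deceased:
            # New, changed, or dead-but-still-listed agents are (re)started.
            start.add(domain)
        else:
            # Unchanged, running agents are left alone.
            keep.add(domain)

    return start, terminate, keep
```

     Because the label is left out of the compared parameters, renaming
     an agent in the configuration file does not by itself place it in
     the start set.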

     Sometimes it is necessary to restart an agent that is still
     running, but malfunctioning. Simply kill the agent, then send pmcd
     a SIGHUP, which will cause the agent to be restarted.

STARTING AND STOPPING PMCD
     Normally, pmcd is started automatically at boot time and stopped
     when the system is being brought down (see rc2(1M) and rc0(1M)).
     Under certain circumstances it is necessary to start or stop pmcd
     manually. To do this one must become superuser and type

          # $PCP_RC_DIR/pcp start

     to start pmcd, or

          # $PCP_RC_DIR/pcp stop

     to stop pmcd. Starting pmcd when it is already running is the same
     as stopping it and then starting it again.

     Sometimes it may be necessary to restart pmcd during another phase
     of the boot process. Time-consuming parts of the boot process are
     often put into the background to allow the system to become
     available sooner (e.g. mounting huge databases). If an agent run
     by pmcd requires such a task to complete before it can run
     properly, it is necessary to restart or reconfigure pmcd after the
     task completes.

     Consider, for example, the case of mounting a database in the
     background while booting. If the PMDA which provides the metrics
     about the database cannot function until the database is mounted
     and available but pmcd is started before the database is ready,
     the PMDA will fail (however pmcd will still service requests for
     metrics from other domains). If the database is initialized by
     running a shell script, adding a line to the end of the script to
     reconfigure pmcd (by sending it a SIGHUP) will restart the PMDA
     (if it exited because it couldn't connect to the database).
     If the PMDA didn't exit in such a situation it would be necessary
     to restart pmcd because if the PMDA was still running pmcd would
     not restart it.

     Normally pmcd listens for client connections on TCP/IP port number
     4321. The environment variable PMCD_PORT may be used to specify an
     alternative port number. If PMCD_PORT is used, care should be
     taken to ensure the environment variable is set before pmcd is
     started, and also in the environment of any client application
     that will connect to pmcd.

LICENSES
     In previous PCP releases, pmcd would terminate immediately if
     there was no valid Collector license on the localhost. This has
     now changed so that on Irix pmcd will run on hosts without a
     Collector license, however an unlicensed pmcd will only accept
     connections from authorized clients. On Linux pmcd will run on any
     host without a license and will accept connections from any
     client.

     Not all PCP tools are authorized clients. See the PCP release
     notes for more details about licenses for PCP.

FILES
     $PCP_PMCDCONF_PATH
          default configuration file

     $PCP_PMCDOPTIONS_PATH
          command line options to pmcd when launched from
          $PCP_RC_DIR/pcp. All the command line option lines should
          start with a hyphen as the first character. This file can
          also contain environment variable settings of the form
          "VARIABLE=value".

     ./pmcd.log
          (or $PCP_LOG_DIR/pmcd/pmcd.log when started automatically)
          All messages and diagnostics are directed here

ENVIRONMENT
     In addition to the PCP environment variables described in the PCP
     ENVIRONMENT section below, the PMCD_PORT variable is also
     recognised as the TCP/IP port for incoming connections (default
     4321).

PCP ENVIRONMENT
     Environment variables with the prefix PCP_ are used to
     parameterize the file and directory names used by PCP. On each
     installation, the file /etc/pcp.conf contains the local values for
     these variables. The $PCP_CONF variable may be used to specify an
     alternative configuration file, as described in pcp.conf(4).

SEE ALSO
     PCPIntro(1), pmdbg(1), pmerr(1), pmgenmap(1), pminfo(1),
     pmkstat(1), pmstore(1), pmval(1), pcp.conf(4), pcp.env(4) and
     dso(5).

DIAGNOSTICS
     If pmcd is already running the message "Error: OpenRequestSocket
     bind: Address already in use" will appear. This may also appear if
     pmcd was shut down with an outstanding request from a client. In
     this case, a request socket has been left in the TIME_WAIT state
     and until the system closes it down (after some timeout period) it
     will not be possible to run pmcd.
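
     Before starting pmcd by hand it can be useful to check whether the
     port is still held. The Python sketch below is one way to do that
     under stated assumptions: the helper names pmcd_port and
     port_in_use are invented for the example, and the probe performs a
     plain bind() without SO_REUSEADDR, which only approximates the
     condition that produces the message above. The PMCD_PORT variable
     and the default of 4321 are taken from this manual page.

```python
import errno
import os
import socket

DEFAULT_PMCD_PORT = 4321  # default port quoted in this manual page


def pmcd_port(env=None):
    """Port pmcd would listen on: PMCD_PORT if set, otherwise 4321."""
    env = os.environ if env is None else env
    return int(env.get("PMCD_PORT", DEFAULT_PMCD_PORT))


def port_in_use(port, host=""):
    """True if a plain bind() to (host, port) fails with EADDRINUSE.

    An empty host means all local addresses. Errors other than
    "address already in use" are re-raised rather than hidden.
    """
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind((host, port))
    except OSError as err:
        if err.errno == errno.EADDRINUSE:
            return True
        raise
    finally:
        # The probe socket is never used for listening; always release it.
        probe.close()
    return False
```

     A call such as port_in_use(pmcd_port()) reports whether the
     configured port can be bound at that moment; it says nothing about
     which process holds the port.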

     In addition to the standard PCP debugging flags, see pmdbg(1),
     pmcd currently uses DBG_TRACE_APPL0 for tracing I/O and
     termination of agents, DBG_TRACE_APPL1 for tracing host access
     control (see ACCESS CONTROL CONFIGURATION above) and
     DBG_TRACE_APPL2 for tracing the configuration file scanner and
     parser.

CAVEATS
     pmcd does not kill its child agents, it only closes their pipes.
     If an agent never checks for a closed pipe it may not terminate.

     The configuration file parser will only read lines of less than
     1200 characters. This is intended to prevent accidents with binary
     files.

     The timeouts controlled by the -t option apply to IPC between pmcd
     and the PMDAs it spawns. This is independent of settings of the
     environment variables PMCD_CONNECT_TIMEOUT and
     PMCD_REQUEST_TIMEOUT (see PCPIntro(1)) which may be used
     respectively to control timeouts for client applications trying to
     connect to pmcd and trying to receive information from pmcd.